Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

complete workflow to check sparql queries #396

Merged
merged 65 commits into from
Oct 19, 2024

Conversation

DeleMike
Copy link
Contributor

Contributor checklist


Description

Adds steps in workflow file to run check_query_identifiers.py

Related issue

Copy link

Thank you for the pull request!

The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. Also consider joining our bi-weekly Saturday dev syncs. It'd be great to have you!

Maintainer checklist

  • The linting and formatting workflow within the PR checks do not indicate new errors in the files changed

  • The CHANGELOG has been updated with a description of the changes for the upcoming release and the corresponding issue (if necessary)

@DeleMike
Copy link
Contributor Author

@andrewtavis, I've finally made the workflow file to work! (omg!!)

but how should the expected behaviour be like? For now, it only logs to the console that the files that have QID issues but the workflow does not actually fail so all green checks.

How do we wanna approach this? Should the workflow fail? is there a way we could notify the PR author that something is wrong?

@andrewtavis
Copy link
Member

The workflow should fail :) A big thing is that it's also a tool for maintainers to know if things are wrong in a PR, and and a failed workflow says that loud and clear. We were discussing warnings for GitHub actions in another issue, which apparently are a thing, but let's keep it all to if there's a problem fail the workflow.

@andrewtavis andrewtavis added the hacktoberfest-accepted Accepted as a part of Hacktoberfest label Oct 17, 2024
@andrewtavis andrewtavis self-requested a review October 17, 2024 07:29
@DeleMike
Copy link
Contributor Author

Okay, I'll figure this out. Thanks for the update @andrewtavis!

@DeleMike
Copy link
Contributor Author

@andrewtavis I have updated the codebase to cause the workflow to fail once we have invalid queries.

with this:

# Exit with an error code if any incorrect QIDs are found
if incorrect_languages or incorrect_data_types:
    sys.exit(1)

I believe with this, all checks are complete

@andrewtavis
Copy link
Member

Quick note being sent to all the testing PRs, if updates are needed now that #402 has been merged, then it'd be great to get those updates to the branch :) If no updates are needed, then let me know 😊

@DeleMike
Copy link
Contributor Author

Quick note being sent to all the testing PRs, if updates are needed now that #402 has been merged, then it'd be great to get those updates to the branch :) If no updates are needed, then let me know 😊

checking...

OmarAI2003 and others added 21 commits October 18, 2024 09:08
- Utilized already built helper functions to support sub-languages when retrieving ISO and QID values.
- Updated table printing to correctly format and display both main languages and sub-languages.
…tion to reflect the new JSON structure, ensuring only data types are printed and no sub-languages unlike before.
…_name' to align with the directory structure in the language_data_extraction directory.
…se list_all_languages, assigning a complete list of all languages.
- Updated all test cases to account for sub-languages.
- Removed tests for 	est_get_language_words_to_remove and 	est_get_language_words_to_ignore, as these functions were deleted from utils.py and the languages metadata files
…. Made the language_metadata parameter optional in two functions. Added a ValueError exception when a language is not found.
- Positive and negative tests for format_sublanguage_name
- Test to validate the output of list_all_languages
…ON structure

- Updated the logic for building language_map and language_to_qid to handle languages with sub-languages.
- Both main languages and sub-languages are now processed in a single pass, ensuring that:
  - language_map includes all metadata for main and sub-languages.
  - language_to_qid correctly maps both main and sub-languages to their QIDs.
…ON structure

- Updated the logic for building language_map and language_to_qid to handle languages with sub-languages.
- Both main languages and sub-languages are now processed in a single pass, ensuring that:
  - language_map includes all metadata for main and sub-languages.
  - language_to_qid correctly maps both main and sub-languages to their QIDs.
…ON structure

- Updated the logic for building language_map and language_to_qid to handle languages with sub-languages.
- Both main languages and sub-languages are now processed in a single pass, ensuring that:
  - language_map includes all metadata for main and sub-languages.
  - language_to_qid correctly maps both main and sub-languages to their QIDs.
@DeleMike
Copy link
Contributor Author

DeleMike commented Oct 18, 2024

Hi @andrewtavis I have adjusted this PR to accommodate the changes from #402 and as you can see a workflow is failing. And that is the one to Check Query Identifiers.

For now, these are the files causing the issue (you can check the workflow message):

Incorrect Language QIDs found in the following files:
- /home/runner/work/Scribe-Data/Scribe-Data/src/scribe_data/language_data_extraction/Igbo/verbs/query_verbs.sparql
- /home/runner/work/Scribe-Data/Scribe-Data/src/scribe_data/language_data_extraction/Korean/verbs/query_verbs.sparql
- /home/runner/work/Scribe-Data/Scribe-Data/src/scribe_data/language_data_extraction/Korean/postpositions/query_postpositions.sparql
- /home/runner/work/Scribe-Data/Scribe-Data/src/scribe_data/language_data_extraction/Korean/adverbs/query_adverbs.sparql

They are failing because they do not exist in the language_metadata.json. Do I update the json file so that this workflow passes?

@DeleMike
Copy link
Contributor Author

hi @andrewtavis 👋🏾
I have updated the language_metadata.json to support the Korean and Igbo languages.

I also updated the test cases to ensure we expect these languages

Now all test cases are passing :)

We can say that all QIDs are valid. Specifically, language QIDs and data type QIDs

Copy link
Member

@andrewtavis andrewtavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really great work here, @DeleMike 😊 Appreciate your dedication to getting these tests up and running! 🚀

@andrewtavis andrewtavis merged commit 9d5c37c into scribe-org:main Oct 19, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hacktoberfest-accepted Accepted as a part of Hacktoberfest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add workflow to check queries
6 participants